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Abstract. Probabilistic Boolean networks (PBNs) is a widely used computa¬ 
tional framework for modelling biological systems. The steady-state dynamics 
of PBNs is of special interest in the analysis of biological systems. However, 
obtaining the steady-state distributions for such systems poses a significant chal¬ 
lenge due to the state space explosion problem which often arises in the case of 
large PBNs. The only viable way is to use statistical methods. We have consid¬ 
ered the two-state Markov chain approach and the Skart method for the analysis 
of large PBNs in our previous work. However, the sample size required in both 
methods is often huge in the case of large PBNs and generating them is expensive 
in terms of computation time. Parallelising the sample generation is an ideal way 
to solve this issue. In this paper, we consider combining the German & Rubin 
method with either the two-state Markov chain approach or the Skart method for 
parallelisation. The first method can be used to run multiple independent Markov 
chains in parallel and to control their convergence to the steady-state while the 
other two methods can be used to determine the sample size required for comput¬ 
ing the steady-state probability of states of interest. Experimental results show 
that our proposed combinations can reduce time cost of computing stead-state 
probabilities of large PBNs significantly. 


1 Introduction 

Computational systems biology aims to model and analyse biological systems from a 
holistic perspective with the use of formal, mathematical reasoning and computational 
techniques that exploit efficient data structures and algorithms. Computational mod¬ 
elling allows systematisation of available biological knowledge concerning biochemical 
processes of a biological system and provides formal means for the analysis and under¬ 
standing of real-life systems. Unfortunately, it often arises that the size of the state space 
of the system to be considered is so huge that it prohibits the analysis. Thus compre¬ 
hensive understanding of biological processes requires further development of efficient 
methods and techniques for formal modelling and analysis of biological systems. 

One key aspect of analysing biological systems is to understand their long-run be¬ 
haviour, which is crucial in many contexts, e.g., in the analysis of the long-term in¬ 
fluence of one gene on another gene in a gene regulatory network (CRN) [1]. In this 
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work, we concentrate on the steady-state analysis of biological mechanisms, in partic¬ 
ular GRNs, cast into the framework of probabilistic Boolean networks (PBNs). As in¬ 
troduced by Shmulevich et al. [2] (see [3] for a recent survey), PBNs are a probabilistic 
generalisation of the standard Boolean networks: they not only incorporate rule-based 
dependencies between genes and allow the systematic study of global network dynam¬ 
ics; but also are able to deal well with uncertainty, which comes naturally in biological 
processes. The dynamics of PBNs can be studied in the realm of discrete-time Markov 
chains (DTMCs). Therefore, the rich theories of DTMCs can be applied to the analysis 
of PBNs. 

Given a PBN, one natural and crucial issue is to study the steady-state probabilities 
of its underlying DTMC, which characterise the long-run behaviour of the correspond¬ 
ing biological systems [4]. Much effort has been spent in analysing the steady-state 
behaviour of biological systems for better understanding the influences of genes or 
molecules in the systems, e.g., the ebb and flow of molecular events during cancer 
progression [1]. Furthermore, steady-state analysis has been used in gene intervention 
and external control [5,6,7,8],which is of special interest to cancer therapist to predict 
the potential reaction of a patient. 

It has been well studied how to compute the steady-state probabilities of small-size 
PBNs using numerical methods [9,10]. However, in the case of large PBNs, their state- 
space size becomes so huge that the numerical methods, often relying on the transition 
matrix of the underlying DTMC of the studied network, are not scalable any more. 
This poses a critical challenge for the steady-state analysis of large PBNs. In fact, ap¬ 
proximations with Markov Chain Monte Carlo (MCMC) techniques remain the only 
feasible method to solve the problem. In our previous work [11], we have considered 
the two-state Markov chain approach and the Skart method for approximate analysis of 
large PBNs. Taking special care of efficient simulation, we have implemented these two 
methods in the tool ASSA-PBN [12], and successfully used it for the analysis of large 
PBNs with a few thousands of nodes. However, the trajectory required for analysing 
a large PBN is often very long and generating such long trajectories is expensive in 
terms of computation time. A natural idea for speeding up this is to perform the trajec¬ 
tory generation in parallel. In this work, we consider combining the Gelman & Rubin 
method [13] with the two methods we have considered in [11]. We simulate multiple 
trajectories in parallel and verify the convergence of the trajectories based on the Gel- 
man & Rubin method. Once convergence is reached according to the Gelman & Rubin 
method, either the two-state Markov chain approach or the Skart method can be applied 
to the converged trajectories to compute the steady-state probability for a set of states 
that are of interest. We show with experiments that the combinations can significantly 
reduce the computation time for approximate steady-state analysis of large PBNs. 

2 Preliminaries 

2.1 Finite discrete-time Markov chains 

We define a discrete-time Markov chain (DTMC) as a tuple (S^so^P), where 5 is a finite 
set of states, so ^ S is the initial state, and P : S x S ^ [0,1] is a transition probability 
matrix. For any two states and s', an element of P{s, s') defines the probability that a 
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transition is made from state 5- to state s'. It satisfies that P{s^s') > 0 for all s^s' e S and 
5'') = 1 for all s^S. A path with length ^ is a sequence of states 5*0, ,..., , 

where G 5 for all / G {0,1,...,^ — 1} and P{si^Si^i ) > 0 for all / G {0,1,...,^ — 2}. 
State s' e S is said to be reachable from state s e S if there exists a path from s to s'. A 
DTMC is said to be irreducible if any two states in the state space are reachable from 
each other. A state of a DTMC is of period d such that d equals to the greatest common 
divisor of the lengths of all paths that start and end in the state. If all states in the state 
space of a DTMC are of period one, then the DTMC is aperiodic. A finite state DTMC 
that is both irreducible and aperiodic is ergo die. Let ;r be a probability distribution on S. 
71 is called a stationary distribution of the DTMC if7t = 7l 'P. According to the ergodic 
theory of DTMC [14], an ergodic DTMC has a unique stationary distribution, being 
simultaneously its limiting distribution. It is also known as the steady-state distribution 
given by lim^^oo tzq'P^, where tzq is any initial probability distribution on S and P^ is 
the n times multiplication of the transition matrix P. Therefore, the limiting distribution 
of an ergodic DTMC is independent of the choice of the initial distribution and it can 
be estimated by iteratively multiplying P. 

2.2 Probabilistic Boolean network 

A probabilistic Boolean network G(y, is composed of a set of binary-valued vari¬ 
ables (also referred to as nodes) V = (vi, V 2 ,..., v„) whose values are governed by a list 
of sets ^ .. ^Fn). The set G = {/{, /^,..., } is defined as a set of possible 

predictor functions for node v/, where / G {1,2,..., ^} and i{i) is the number of possible 
predictor functions for v/. Each predictor function /j is a Boolean function defined with 
respect to a subset of nodes referred to as parent nodes of the node v/. At a given time 
point r(r = 0,l,...), one predictor function is selected for each of the nodes. We call 
the combination of all the selected prediction functions at time t a realisation of a PBN. 
Assuming independence among the predictor functions for different nodes, there are 
N = n?=i ^(0 possible realisations for a PBN. We denote the realisations as vectors 
^ G {1,... where the i-th element is the Boolean function selected for node Vj. The 
realisation at time point t is expressed as f{t). For a node v/, the selection probability 
for selecting its jth predictor function is denoted as ^ and it holds that c^^ = 1. 
The state of a PBN at time point t, denoted as s{t), is defined as a collection of all the 
node values at time t, namely s{t) = {vi{t),V 2 {t),.. .^Vn{t)), where Vi{t) is the value of 
node Vi at time t. A PBN with n nodes has 2^ possible states and s{t) G {0, for each 
t. The transition from s{t) to s{t A 1) is conducted by synchronously updating the node 
values according to the realisation at time t, i.e., s{t + 1) = f{t){s{t)). 

The concept of perturbations is introduced to PBN by providing a parameter p G 
(0,1), which is used to sample a perturbation vector y{t) = ( 7 i(r), 72 (^),... , 7 „(^)), 
where each 7 (r) G {0,1} is a Bernoulli distributed random variable with parameter p 
for all t and / G {1,2,... If Yi{t) = 0, the next state of a PBN is given by s{t + 1) = 
f{t){s{t)); otherwise, it is determined as ^’(^ -f 1 ) = s{t) 0 Y{t), where 0 is the ‘exclusive 
or’ operator for vectors. Perturbations allow to reach an arbitrary state from any other 
state within one transition in a PBN. Thus the dynamics of a PBN with perturbations 
can be viewed as an ergodic DTMC over S = {0, [2]. With the ergodic theory of 
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Fig. 1: Conceptual illustration of the idea of the two-state Markov chain construction, 
(a) The state space of the original discrete-time Markov chain is split into two meta 
states: states A and B form meta state 0, while states D, C, and E form meta state 1. The 
split of the state space into meta states is marked with dashed ellipses, (b) Projecting the 
behaviour of the original chain on the two meta states results in a binary (0-1) stochastic 
process which can be approximated as a first-order, two-state Markov chain. 


DTMCs [14], it can be concluded that the long run dynamics of a PEN is independent 
of the choice of its initial state. This allows the estimation of the steady-state behaviour 
of a PEN by performing simulation from an arbitrary initial state. 

The density of a PEN is measured with its predictor functions number and parent 
nodes number. For a PEN G, its density is defined as &{G) = ^ ^(0’ where Np is 

the total number of predictor functions in G and 6{i) is the number of parent nodes of 
the i-th predictor function. 


3 Steady-state Analysis of PBNs 

As shown in [11], both the two-state Markov chain approach and the Skart method are 
effective for analysing steady-state probabilities of a PEN with number of nodes up 
to a few thousands. We briefiy discuss in this section these two methods. An efficient 
implementation of the two methods for the analysis of large PE Ns is available in the 
ASSA-PEN tool [12]. 

3.1 The two-state Markov chain approach 

The two-state Markov chain approach [15] is a method for approximate computation of 
the steady-state probability for a subset of states of a DTMC. This approach splits the 
states of an arbitrary DTMC into two parts, referred to as two meta states. One part is 
composed of the states of interest, numbered 1, and the other part is its complement, 
numbered 0. Such consideration abstracts an arbitrary DTMC into a 0-1 stochastic pro¬ 
cess, which can further be approximated by a first-order, two-state DTMC that consists 
of the two meta states with transition probabilities a and j3 between them. Fig. 1 illus¬ 
trates the construction of a two-state Markov chain from a 5-state Markov chain. 
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Algorithm 1 The Two-state Markov chain approach 
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procedure ESTIMATEPROBABILITY(mo, 

M := rriQ', N := hq; I := M-\-N; 

Generate an initial trajectory of length / abstracted to the two meta states. 

repeat 

Extend the trajectory by M -h A — /. 

/:=M-hA; 


Estimate a, /3 based on the last N elements of the extended trajectory 




apil-a-p) (<I>-'(|(l+.«)))' 

{a+py r2 


until M-hA < I 


Estimate the prob. of meta state 1 from the last A elements of the trajectory. 

end procedure 


The steady-state probability of meta state 1 can be estimated by performing simu¬ 
lation of the original DTMC. This estimation is achieved iteratively using the standard 
two-state Markov chain approach of [15] to guarantee that the samples used for estima¬ 
tion are drawn from a distribution which differs from the the steady-state distribution 
at most by £ and that two precision requirements (precision r, and confidence level s) 
are satisfied. We outline the steps in Algorithm 1. The two arguments rriQ and no are 
the initial ‘burn-in’ period and the initial sample size, respectively. In each iteration of 
the algorithm, the ‘bum-in’ steps M and the actual sample size A are re-estimated. The 
iteration continues until the estimated sample size (MH-A) is not bigger than the current 
trajectory length. For more details on this approach, a derivation of the formulas for M 
and A in line 8, and a discussion regarding a proper choice of no, we refer to [11]. 


3.2 The Skart method 

Proposed by Tafazzoli et al. [16] in 2008, the Skart method is a non-overlapping batch 
mean method that can be used to estimate the steady-state probabilities of a DTMC 
from a simulated trajectory of the DTMC. It divides the simulated trajectory of length 
7], i.e., {Xi : i = 1,2,..., T]}, into p non-overlapping batches, each of size K. See Fig¬ 
ure 2a for an illustration. Assuming p and K are both large enough, it guarantees that 
the batch means are approximately independent and identically distributed (i.i.d) nor¬ 
mal random variables. The grand mean of the individual batch means, denoted as /i, is 
considered as a point estimator of px, i-e., the expected value of Xf. In practise, some 
initial batches, known as ‘burn-in’ steps and denoted as are discarded to eliminate the 
initialisation bias when computing the point estimator px- The method then constructs 
a Cl (confidence interval) estimator for px that is centered on p. The key process of the 
Skart method is to determine a proper batch size K, a proper batch number p, and proper 
number of ‘bum-in’ steps so that the computed steady-state estimations are approx¬ 
imately the theoretical ones and the computed Cl estimator satisfies certain precision 
requirements. This is achieved by using randomness test, autocorrelation, and skew¬ 
ness adjustment in an iterative way. We summarise the process of the Skart method 
in Algorithm 2, and we refer to [16] for a more detailed description. Given two input 
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K K K 



rj = K^p 

(a) A single chain is divided into p batches 


K K K 



K"* \p/(0\ 


(b) ft) chains are divided into p batches 


Fig. 2: Demonstration of dividing samples into batches for the Skart method, (a) Divid¬ 
ing samples from a single chain into p batches, each of size k. The chain size rj = K^p. 
(b) Dividing samples from ft) chains into p batches, each of size K. The size of each 
chain is K* * [p/ft)]. The actual number of batches after dividing is p' = ft) * [p/ft)]. 


Algorithm 2 The Skart method 

1: procedure estimateCI(//* , a) 

2: ?7 := 1280; / := 1024; pass := EALSE; 

3: Generate an initial trajectory of length rj . 

4: Compute the skewness B of the last / samples and set batch size K based on B. 

5: p:=ri;ri := K^p; 

6: Extend trajectory to rj and compute randomness test statistics C 

7: while Independence test is not passed do 

8: Adjust batch size K, number of batches p and spacer [17] size d; rf := K ^ p 

9: Extend trajectory to rj and compute randomness test statistics C 

10: end while 

11: ^ : = d^K; skip first ^ samples. 

12: repeat 

13: Extend the trajectory to length rj if necessary. 

14: Compute nonspaced batch means p and variance estimator Var. 

15: Compute skewness and autogression adjusted Cl based on Var and a. 

16: H = max{n - Clbonom, Chop “At}; 

17: if 7/ > //* then //check whether the precision requirement is satisfied 

18: Adjust batch size K and number of batches p'; 

19: T] \= K ^ [p' -\- d) //p' does not contain the number of discarded batches 

20: else pass := TRUE; 

21: end if 

22: until pass 

23: end procedure 


parameters W (precision requirement) and a (confidence level), this algorithm com¬ 
putes a Cl estimator [Clhottom^CItop] and a point estimator p which together satisfy that 
W < max{ii — Clbottom^CItop ~ the real steady-state probability of the system 

is within 100(1 — a)% probability. 
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4 Parallel Steady-state Analysis of Large PBNs 

As shown in [11], the two above mentioned simulation-based methods work well for 
computing the steady-state probabilities for large PBNs even with a few thousands of 
nodes. However, it often appears that a huge sample size is required in the case of large 
PBNs, which can be computationally expensive. In principle, parallelising the sample 
generation process can be considered an ideal solution to this problem. We propose to 
combine the Gelman & Rubin method with the two above mentioned methods. The 
Gelman & Rubin method is used to monitor that all the simulated chains have approxi¬ 
mately converged to the steady-state distribution while the other two methods are used 
to determine the sample size required for computing the steady-state probabilities of 
the states of interest. In the following part of this section, we first introduce the Gelman 
& Rubin method and then discuss how to parallelise the two methods mentioned in 
Section 3 by using the Gelman & Rubin method. 

4.1 The Gelman & Rubin method 

The Gelman & Rubin method [13] is an approach for monitoring the convergence of 
multiple chains. It starts from simulating Ixj/ steps of CO >2 independent Markov chains 
in parallel. The first if/ steps of each chain, known as the ‘burn-in’ period, are discarded 
from it. The last y/ elements of each chain are used to compute the within-chain (W) 
and between-chain (B) variance, which are used to estimate the variance of the steady 
state distribution (d^). Next, the potential scale reduction factor R is computed with d^. 
R indicates the convergence to the steady state distribution. The chains are considered 
as converged and the algorithm stops if R is close to 1; otherwise, if/ is doubled, the 
trajectories are extended, and R is recomputed. We list the steps of this approach in 
Algorithm 3. For further details of this method and the discussion on the choice of the 
initial states for the CO chains we refer to [13]. 

4.2 Parallelising the two-state Markov chain approach 

To reduce the time cost of the two-state Markov chain approach in the case of large 
PBNs, we propose to parallel this approach by providing samples from multiple chains. 
This is achieved by combing the two-state Markov chain approach with the German 
& Rubin method. The German & Rubin method is used to run multiple chains of the 
original DTMC to assure their convergence to the steady-state distribution. Once con¬ 
vergence is reached, the second halves of the chains are merged into one sample, and the 
two-state Markov chain approach is applied to estimate N based on the merged sample. 
As convergence is guaranteed, no ‘burn-in’ period is considered. The stop criterium for 
the two-state Markov chain approach becomes that the estimated value N is not bigger 
than the size of the merged sample. If the stop criterium is not satisfied, the multiple 
chains are extended in parallel to provide a sample of required length. 

The above idea is based on the assumption that once the simulated chains of the 
original Markov chain have converged, the two-state Markov chain abstraction is also 
converged to its steady-state distribution. To guarantee that this assumption is correct, 
we add an additional step. Once the stop criterium is satisfied, the ‘bum-in’ period 
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Algorithm 3 The Gelman & Rubin method 

1: procedure generateConvergedChains(co, i/aq) 

2 : 

3: Generate in parallel ft) trajectories of length 2\\f\ 

4: repeat 

5: chains(L.. ft),L. . 2y/) := Extend all the ft) trajectories to length 2\\f\ 

6: for/=L.ft) do 

7: l^i ■= mean of the last i/r values of chain /; 

8: Si := standard deviation of the last y/ values of chain /; 

9: end for 

10: 

11: B := HfLi ^ •= ^ HfLi //Between and within variance 

12: <7^ := (1 — ^)W + ^B; //Estimate the variance of the stationary distribution 

13: R\= \J d'^ jW ; //Compute the potential scale reduction factor 

14: yf\=2yf\ 

15: until R is close to 1 

16: return (chains,i/r/2); 

17: end procedure 


Algorithm 4 The Parallelised two-state Markov chain approach 

1: procedure estimateProbabilityInParallel(cu, y / Q ^ s ^ r ^ s ) 

2: {chains^ y/) := generateConvergedChains{co, i/Zq); 

3: n := 0; extend Jby := y/; monitor := EALSE; abstracted sample := NULL; 

4: repeat 

5: repeat 

6: chains := Extend in parallel each chain in chains by extend Jby, 

7: sample := chains{l ... ft), (zz-h i/r-h 1). .. {n-\-yf-\-extendedJ?y))\ 

8: abstracted sample := abstract sample and combine with abstracted sample ; 

9: n\=n-\- extend Jby, samplesize := con; 

10: Estimate a,P from abstracted sample; 

11: Compute N as Line 8 Algorithm 1; 

12: extend Jby := [ {samplesize —N)/(o\, 

13: until ext end Jby < 0 

14: Compute M as Line 8 Algorithm 1; monitor := EALSE; 

15: ifM>i//then 

16: ext end Jby := yf — M; y/ :=M; monitor := TRUE; 

17: end if 

18: until monitor 

19: Estimate the prob. of meta state 1 from abstracted sample; 

20: end procedure 


M of the two-state Markov chain is computed. The assumption is verified true if M 
is not larger than the ‘burn-in’ period y/ of the Gelman & Rubin method. Otherwise, 
additional M — yf elements will be discarded in the beginning of each chain and the 
two-state Markov chain part is re-run on the modified sample. The detailed steps of this 
approach are outlined in Algorithm 4. 
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When analysing a biological system, we are often interested in more than one set 
of states, e.g., in the case of a long-run sensitivity analysis of a PBN used to model 
a biological system. For simplicity, we call the steady-state probability of the states of 
interest as one property and computing this steady-state probability as checking one 
property. Given q different properties, the two-state Markov chain approach needs to be 
run for q times in order to check all of them. Since the generation process is the most 
time consuming part in the algorithm, the time cost for checking multiple properties can 
be reduced significantly if we can reuse the generated samples. We modify Algorithm 4 
to allow the reuse of samples for computing the steady-state probabilities of multiple 
properties. The crucial idea is that the simulated samples are abstracted into different 
meta states based on these different q properties simultaneously each time when an 
extension of chains is finished. The calculations of N for different properties are then 
performed simultaneously as well, resulting in another level of parallelisation. The next 
extension length is determined by the minimal value of all the calculated A^s. Using the 
minimal value would increase the extension times; however it can avoid unnecessary 
abstraction of samples, which is a relatively expensive process. The extension process is 
stopped when all the calculated A^s are smaller than the current sample size. The steady- 
state probabilities of all the properties are then calculated based on their corresponding 
abstracted meta states. 

4.3 Parallelising the Skart method 

The trajectory required by the Skart method can be very long as well in the case of large 
networks. We propose to apply a similar strategy as what we have done for the two-state 
Markov chain approach to reduce the time cost of the Skart method. It is assumed in 
the Skart method that the number of batches p and the batch size K are large enough 
to guarantee that the batch means are approximately i.i.d normal random variables. 
This assumption still holds when the batches are obtained from different chains given 
that convergence has been reached in those chains. Therefore, the Skart method can be 
parallelised by fetching samples from multiple chains which have converged. We use 
the Gelman & Rubin method to guarantee that different chains have converged and the 
Skart method to determine the number of batches and the size of each batch in order to 
estimate the target stationary probability with a given precision. Since the burn-in steps 
are already discarded by the Gelman & Rubin method, no samples will be truncated 
when computing the Cl in the parallelised version of the Skart method. For efficiency 
consideration, we further require that the number of batches obtained from each chain 
is the same. This leads to the fact that the number of batches used in the parallelised 
Skart method is slightly larger than that in the original Skart method. Figure 2 shows 
how a single chain and multiple chains are divided into batches. We summarise the 
parallelisation of the Skart method in Algorithm 5 and highlight the lines where there 
exists a main difference with respect to the sequential method by adding comments. 

5 Evaluation 

We have implemented Algorithm 1 and Algorithm 2 in the tool ASSA-PBN [12] and 
evaluated their performance in our previous work [11]. We show in this section that the 
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Algorithm 5 The Parallelised Skart method 

1: procedure estimateCIInParallel(//*, a) 

2: (chains, y/) \= generateConvergedChains((0,yfQ)\ 

3: Skip first y/ samples of each chain; rj := [1,280/ft)] * ft); pass := EALSE; 

4: Extend all the chains each to length rj/o) if necessary 

5: Compute the skewness B and set batch size K based on B // use all samples 

6 : p \= \r]/co^^co\r] \= K^p; //batch number is adjusted 

7: Extend all the chains each to rj/co and compute randomness test statistics C 

8: while Independence test is not passed do 

9: Adjust batch size K, number of batches p and spacer size d\ 

10: p\= [/?/ft)] * ft); T] ’= K^p //batch number is adjusted 

11: Extend all the chains each to rj/co and compute randomness test statistics C 

12: end while 

13: repeat 

14: Extend all the chains each to length rj/co if necessary 

15: Compute nonspaced grand batch mean p and variance estimator Var 

16: Compute skewness and autogression adjusted Cl based on Var and a 

17: H = max{n - Clhottom, Chop - 

18: ifff>ff*then 

19: Adjust batch size K and number of batches p; 

20: p := Ip/ co] ^ co; rj := K^p //do not consider discarding the first d samples 

21: else /?a55:=TRUE 

22: end if 

23: until pass 

24: end procedure 


proposed two parallel algorithms can significantly reduce the time cost for computing 
steady-state probabilities of large PBNs in comparison with their sequential versions. 
We evaluate this first on randomly generated PBNs and then on a PBN from a real bio¬ 
logical system. All the experiments in this paper are conducted in a high-performance 
computing (HPC) node, which contains 16 Intel Xeon E7-4850 processors @2GHz. The 
16 processors are equally distributed in 4 Bull S6030 boards (servers) and each pro¬ 
cessor contains 10 cores. This hardware architecture allows us to run a program with 
maximum 40 cores in one board. 

ASSA-PBN was implemented in Java and the initial and the maximum Java Heap 
Size were set to 2GB and 64GB, respectively. All the experimental data are available at 
http://satoss.uni.lu/software/ASSA-PBN/. 

5.1 Speed-up for checking a single property 

We first evaluate the speed-up for checking a single property using the parallelised 
algorithms, i.e., Algorithm 4 and Algorithm 5. We randomly generate 18 different PBNs 
using the tool ASSA-PBN. ASSA-PBN can randomly generate a PBN which satisfies 
structure requirements given in the form of five input parameters: the node number, 
the minimum and the maximum number of predictor functions per node, finally the 
minimum and the maximum number of parent nodes for a predictor function. The 18 
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PBNs are generated with node numbers from the set {80,100,200,500,1000,2000}. 
For each node number, 3 PBNs are generated. We assign the obtained PBNs into three 
different classes with respect to their density measure dense models with density 
150 — 300, sparse models with density around 10, and in-between models with density 
50 — 100. The precision and confidence level of all the experiments are set to 10“^ and 
0.95, respectively. The parameter £ in the two-state Markov chain approach is set to 
10“^^. We compute steady-state probabilities for the 18 PBNs using Algorithms 1, 2, 4, 
and 5 and compare their results as well as time costs. The parallelised algorithms are 
launched with 6 different number of cores in one board ranging in {2,5,10,20,30,40}. 

As the models we use are too large to be dealt with numerical methods, it is not 
possible to check the correctness of the parallelised algorithms using the models’ theo¬ 
retical steady-state probability distributions. Instead we compare the results of the par¬ 
allelised algorithms with their corresponding sequential algorithm. We collect the two 
probabilities computed by Algorithms 1 and 4 or by Algorithms 2 and 5 for checking 
the same property of one model as a pair. In this section, there are 216 pairs of prob¬ 
abilities in total. We expect the difference of the two probabilities in a pair to be less 
than 2 x 10“"^ with the probability of 95%. In all the 216 pairs that we have obtained, 
the difference is always less than 2 x 10“"^. Moreover, we perform a similar verification 
for the evaluation results in Section 5.2 and obtain the same observations. 

Figure 3a and Figure 3b present the speed-ups when analysing a single property 
of the 18 PBNs with the parallelised two-state Markov chain approach. Each speed¬ 
up is computed as t sequential /t parallel, where t sequential IS the time cost of the sequential 
two-state Markov chain approach and tparallel is the time cost of its parallelised ver¬ 
sion. The parallelised approach is launched with different number of cores ranging in 
{2,5,10,20,30,40}. We see clearly from these figures that the speed-up is almost pro¬ 
portional to the number of cores. Meanwhile, the speed-ups vary a lot in different mod¬ 
els with 40 cores, e.g., we observe a speed-up about 46 in the case of the 2000-node 
dense model and a speed-up of 22 in the case of the 200-node sparse model. 

On the one hand, the required sample size varies in different runs due to the nature 
of the two-state Markov chain approach. The speed-up can be bigger than the number 
of cores when the required sample size in the parallelised run is smaller than what is 
required in the sequential run - this is actually the case where we obtain the speed-up of 
46 for the 2000-node dense model. On the contrary, the speed-up can be much smaller 
than the number of cores when the required sample size in the parallelised run is bigger 
than that required in the sequential run - this is the case where we obtain the speed-up of 
31.7 for the 80-node dense model. To show the affection of sample size, we compute the 
speed-ups for the parallelised run with 40 cores after eliminating the affection of sample 
size using the formula sp^ size^jsizes, where sp is the original speed-up computed with 
isequential/iparallel, ^izCs is the Sample size used for the sequential run, and sizep is the 
sample size used for the parallel run. The results are shown in Table l(in the rows 
labelled with speed-up E). On the other hand, the time cost also varies when checking 
a property for different models. When the time cost of the sequential run is small, the 
percentage of time spent in generating samples is also small. As a consequence, the 
percentage of time the parallelised algorithm can reduce is small as well. Therefore, 
the speed-up the parallelised algorithm can gain is small when the time cost of the 
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(a) The two-state Markov chain approach (I): 



(b) The two-state Markov chain approach (II): 
PBNs with 500, 1000, 2000 nodes. 



(c) The Skart method (I): PBNs with 80, 100, 
200 nodes. 


(d) The Skart method (II): PBNs with 500, 
1000, 2000 nodes. 


Fig. 3: Speed-up of the parallelised algorithms. 


sequential run is small - this is the case where we obtain the speed-up of 22 for the 
200-node sparse model. On the contrary, a larger speed-up is easy to obtain when the 
time cost of the sequential run is big - this is the case where we obtain the speed-up of 
46 for the 2000-node dense model. We obtain similar speed-ups with the Skart method 
and the speed-ups are presented in Figure 3c and Figure 3d. 

Besides, we obtain maximum speed-ups with the use of 40 cores under the current 
hardware condition. To illustrate this, we show in Table 1 more detailed information, 
i.e., the time costs (in minutes), the actual sample sizes (of millions), the speed-ups, 
and the speed-ups after eliminating the affection from the sample size. Note that in 
order to make the results as accurate as possible, all speed-ups are computed using the 
original time and size values we get from experiments, not the approximate ones shown 
in Table 1. For the two-state Markov chain approach, the speed-ups are greater than 30 
for 14 out of 18 cases and for the Skart method, the speed-ups are greater than 30 for 
12 out of 18 cases. 

Moreover, we show in the next section that with the use of 40 cores, speed-ups 
between 19.67 and 33.04 are obtained for a 96-node PBN from a real biological system. 
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node # 

80 

100 

200 

density 

sparse 

in-between 

dense 

sparse 

in-between 

dense 

sparse 

in-between 

dense 

two-state 

time 

(m) 

seq. 

27.4 

50.7 

30.6 

30.4 

52.5 

119.5 

18.0 

159.3 

171.0 

par. 

1.1 

1.4 

1.0 

1.3 

1.5 

3.0 

0.8 

4.6 

4.8 

size 

(million) 

seq. 

84.7 

105.3 

41.6 

70.2 

98.8 

117.7 

25.9 

134.6 

83.7 

par. 

85.2 

105.0 

49.3 

70.3 

98.8 

117.5 

25.9 

134.7 

83.7 

speed-up 

24.3 

37.5 

31.7 

23.9 

34.5 

40.0 

22.1 

34.7 

35.8 

speed-up E 

24.5 

37.4 

37.7 

23.9 

34.5 

39.9 

22.2 

34.7 

35.8 

Skart 

time 

(m) 

seq. 

29.9 

49.2 

31.7 

30.5 

47.6 

110.0 

16.7 

167.3 

176.9 

par. 

1.1 

1.9 

1.0 

1.1 

2.2 

3.8 

0.9 

5.4 

4.4 

size 

(million) 

seq. 

94.1 

110.2 

43.5 

72.7 

89.6 

110.1 

24.5 

141.2 

86.9 

par. 

78.3 

109.7 

40.8 

69.6 

109.7 

123.5 

29.1 

131.1 

73.8 

speed-up 

28.2 

25.5 

32.3 

28.9 

21.3 

28.9 

17.8 

31.0 

39.8 

speed-up E 

23.4 

25.4 

30.3 

27.7 

26.1 

32.5 

21.1 

28.8 

33.8 

node # 

500 

1000 

2000 

density 

sparse 

in-between 

dense 

sparse 

in-between 

dense 

sparse 

in-between 

dense 

two-state 

time 

(m) 

seq. 

302.5 

556.9 

410.8 

211.3 

620.4 

1841.7 

111.0 

460.5 

643.1 

par. 

9.3 

15.9 

11.4 

7.2 

20.3 

52.6 

3.2 

14.0 

14.0 

size 

(million) 

seq. 

156.1 

198.3 

113.9 

75.8 

176.7 

297.6 

44.8 

114.1 

115.5 

par. 

153.8 

204.8 

114.0 

75.6 

176.4 

302.1 

40.8 

113.5 

91.8 

speed-up 

32.7 

34.9 

36.2 

29.5 

30.6 

35.0 

34.9 

32.8 

45.8 

speed-up E 

32.2 

36.0 

36.2 

29.5 

30.5 

35.6 

31.8 

32.6 

36.4 

1 Skart 

time 

(m) 

seq. 

278.5 

594.0 

394.2 

218.7 

671.3 

2095.5 

98.9 

467.5 

466.9 

par. 

9.1 

15.9 

11.7 

6.0 

19.5 

51.0 

3.2 

11.7 

13.0 

size 

(million) 

seq. 

144.1 

216.5 

114.9 

78.7 

185.8 

292.9 

40.1 

109.2 

94.1 

par. 

134.9 

208.6 

117.4 

75.3 

184.4 

274.0 

38.4 

103.3 

86.4 

speed-up 

30.6 

37.4 

33.6 

36.3 

34.5 

41.1 

31.3 

39.8 

35.8 

speed-up E 

28.7 

36.0 

34.3 

34.7 

34.3 

38.5 

29.9 

37.7 

32.9 


Table 1: Speed-ups using 40 cores for Algorithm 4 and Algorithm 5. 


5.2 Speed-up for checking multiple properties 

We have performed one influence analysis and two long-run sensitivity analyses of an 
apoptosis network using the sequential two-state Markov chain approach in [11]. The 
apoptosis network contains 96 nodes; one of the nodes, i.e., UV, can take on three values 
and was reflned as UV(1) and UV(2) in order to cast its original model into the binary 
PBN framework. The 96 nodes with 107 Boolean functions and their parameters, i.e., 
the selection probabilities of Boolean functions, were fitted to experimental data in [10]. 
We took the 20 best fitted parameter sets and performed our analyses for them. With an 
efficient implementation of a PBN simulator, we managed to finish this analysis in an 
affordable amount of time. Nevertheless, the analysis was still very expensive in terms 
of computation time since the trajectories required were huge and we needed to check 
a number of properties. 

In this work, we re-perform part of the influence analyses by running the paral¬ 
lelised two-state Markov chain approach for checking 7 properties simultaneously with 
40 cores. In the influence analysis, we aim to compute the long-term influences on com- 
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sequential 

parallelised 

speed-up 

property # 

sample size 
(million) 

time (m) 

property # 

sample size 
(million) 

time(m) 

1 

146.99 

53.30 

1 

147.20 

2.31 

23.04 

2 

454.14 

174.25 

2 

461.58 

6.81 

25.58 

3 

253.45 

97.77 

3 

253.60 

3.86 

25.32 

4 

48.81 

16.71 

4 

50.11 

0.73 

22.83 

5 

305.35 

120.33 

5 

335.85 

5.38 

22.39 

6 

50.21 

17.65 

6 

51.31 

0.90 

19.67 

7 

255.17 

99.75 

7 

263.39 

3.02 

33.04 

total 

1563.05 

579.75 

1-7 

452.74 

10.96 

52.88 


Table 2: Performance comparison on multiple properties with the two-state Markov 
chain approach. 


plex2 from each of its parent nodes: RIP-deubi, complex 1, and FADD, in accordance 
with the definition in [18]. We consider both the case of UV(1) and UV(2) and hence we 
construct 2 PBNs for each of the 20 best fit parameter sets. In total, we need to compute 
7 different steady-state probabilities for 40 different PBNs. 

Previously, we have applied the two-state Markov chain approach 280 times to fin¬ 
ish the computation. Using the parallelised version, we only need to perform the paral¬ 
lelised two-state Markov chain approach 40 times since 7 properties for one PBN can 
be checked in one run. In this evaluation, we perform the parallelised two-state Markov 
chain approach to check the 7 properties of one of the 40 PBNs simultaneously and 
show in Table 2 the time cost (in minutes), the actual sample size (of millions) we use 
and the speed-ups we obtain for checking them with the sequential and parallelised al¬ 
gorithms. To make the comparison complete, we also perform the parallelised two-state 
Markov chain approach to check the 7 properties one by one and show the results in 
Table 2. The precision r, confidence level 5', and steady-state convergence parameter 
£ in this experiment are set to 10“^, 0.95 and 10“^^, respectively. The speed-ups for 
checking a single property are computed similarly as in Figure 3; while the speed-up 
for checking the 7 properties is computed with ^ 4=1 where ti is the time cost for 

checking the i-th property with the sequential algorithm and t^uiti is the time cost for 
checking the 7 properties simultaneously with the parallelised algorithm. From Table 2, 
the parallelised algorithm obtains a speed-up between 19.67 and 33.04 for checking a 
single property and a speed-up of 52.88 for checking 7 properties. By reuse of generated 
samples, the sample size is also reduced by 3.45 times from 1563.05 to 452.74 million. 


6 Conclusion and Future Work 

In this paper, we proposed to combine the German & Rubin method with two statistical 
methods, i.e., the two-state Markov chain approach and the Skart method, to reduce 
the time cost for computing steady-state probabilities of large PBNs. We showed with 
experiments that the proposed combinations can reduce the time cost of the original 
sequential methods significantly. 
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Our parallelised algorithms work well on multiple-core CPU architecture. However, 
the scalability of multiple-core CPU based parallelisation is often restricted by the CPU 
architecture since the number of processing units in a CPU is usually small. On the 
contrary, GPUs often contain thousands of processing units and GPUs achieve high 
performance when thousands of threads execute concurrently [19]. Parallelising those 
algorithms with GPU based architecture will potentially lead to further speed-ups. 
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